NSF PAR Search | NSF Public Access Repository

Efficient counter-factual type error debugging

https://doi.org/10.1016/j.scico.2020.102544

Chen, Sheng; Wu, Baijun (December 2020, Science of Computer Programming)
Full Text Available
Unsupervised Lifelong Learning with Curricula

https://doi.org/10.1145/3442381.3449839

He, Yi; Chen, Sheng; Wu, Baijun; Yuan, Xu; Wu, Xindong (April 2021, Proceedings of the Web Conference 2021)

Full Text Available
Efficient Counter-factual Type Error Debugging

https://doi.org/10.1109/TASE.2019.00-13

Chen, Sheng; Wu, Baijun (July 2019, 2019 International Symposium on Theoretical Aspects of Software Engineering)

Type inference is an important part of functional programming languages and has been increasingly adopted in imperative programming. However, providing effective error messages in response to type inference failures (due to type errors in programs) continues to be a challenge. Type error messages generated by compilers and existing error debugging approaches often point to bogus error locations or lack sufficient information for removing the type error, making error debugging ineffective. Counter-factual typing (CFT) addresses this problem by generating comprehensive error messages, each of which includes a rich set of information. However, CFT has a long response time, making it too slow for interactive use. In particular, our recent study shows that programmers usually have to go through multiple iterations of updating and recompiling programs to remove a type error. Interestingly, our study also reveals that the program updates in each iteration of type error debugging are minor. We exploit this fact and develop eCFT, an efficient version of CFT, which does not recompute all error fixes from scratch for each updated program but recomputes only the error fixes that change in response to the update. Our key observation is that minor program changes lead to minor changes in error suggestions. eCFT is based on principal typing, a typing scheme more amenable to reusing previous typing results. We have evaluated our approach and found that it is about 12.4× faster than CFT in updating error fixes.
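The idea of recomputing only what an update affects can be illustrated with a small sketch. This is not eCFT's algorithm: it is a generic cache keyed by the source text of each declaration, and the names `compute_fixes` and `IncrementalAnalyzer` are hypothetical. It also assumes declarations are independent of one another, which a real type checker cannot assume.

```python
import hashlib


def compute_fixes(decl_source):
    """Hypothetical stand-in for an expensive per-declaration analysis."""
    compute_fixes.calls += 1
    return f"fixes for <{decl_source.strip()}>"


compute_fixes.calls = 0  # counts how often the expensive step actually runs


class IncrementalAnalyzer:
    """Caches results per declaration and recomputes only changed ones."""

    def __init__(self):
        self._cache = {}  # declaration name -> (source digest, result)

    def analyze(self, program):
        """program: dict mapping a declaration name to its source text."""
        results = {}
        for name, source in program.items():
            digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
            cached = self._cache.get(name)
            if cached is None or cached[0] != digest:
                # New or edited declaration: run the expensive analysis.
                cached = (digest, compute_fixes(source))
                self._cache[name] = cached
            results[name] = cached[1]
        # Forget declarations that were removed from the program.
        for name in list(self._cache):
            if name not in program:
                del self._cache[name]
        return results


analyzer = IncrementalAnalyzer()
version_1 = {"f": "f x = x + 1", "g": "g y = not y"}
version_2 = dict(version_1, g="g y = not (y)")  # a minor edit to g only
analyzer.analyze(version_1)
analyzer.analyze(version_2)
```

After the second call only `g` has been re-analyzed, so the expensive step has run three times in total rather than four.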
Full Text Available
Toward Mining Capricious Data Streams: A Generative Approach

https://doi.org/10.1109/TNNLS.2020.2981386

He, Yi; Wu, Baijun; Wu, Di; Beyazit, Ege; Chen, Sheng; Wu, Xindong (January 2020, IEEE Transactions on Neural Networks and Learning Systems)
Full Text Available
Online Learning from Capricious Data Streams: A Generative Approach

https://doi.org/10.24963/ijcai.2019/346

He, Yi; Wu, Baijun; Wu, Di; Beyazit, Ege; Chen, Sheng; Wu, Xindong (August 2019, International Joint Conference on Artificial Intelligence Main track)

Learning with streaming data has received extensive attention during the past few years. Existing approaches assume that the feature space is fixed or changes by following explicit regularities, limiting their applicability in dynamic environments where the data streams are described by an arbitrarily varying feature space. To handle such capricious data streams, in this paper we develop a novel algorithm, named OCDS (Online learning from Capricious Data Streams), which makes no assumptions about feature space dynamics. OCDS trains a learner on a universal feature space that establishes relationships between old and new features, so that the patterns learned in the old feature space can be used in the new feature space. Specifically, the universal feature space is constructed by leveraging the relatedness among features. We propose a generative graphical model to model the construction process, and show through theoretical analysis that learning from the universal feature space can effectively improve performance. The experimental results demonstrate that OCDS achieves conspicuous performance on synthetic and real datasets.
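As a rough illustration of learning over a feature space that grows and shrinks arbitrarily, the sketch below keeps one weight per feature name over the union of all features seen so far. It is not OCDS: the class name `UnionSpacePerceptron` is hypothetical, the learner is a plain perceptron, and features missing from an instance are simply treated as zero, whereas the abstract describes relating old and new features through a generative graphical model.

```python
class UnionSpacePerceptron:
    """Online perceptron whose weights live on the union of seen features."""

    def __init__(self, learning_rate=1.0):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight

    def predict(self, features):
        """features: dict of feature name -> value; returns +1 or -1."""
        score = sum(self.weights.get(name, 0.0) * value
                    for name, value in features.items())
        return 1 if score >= 0 else -1

    def update(self, features, label):
        """Standard perceptron step: adjust weights only on a mistake."""
        if self.predict(features) != label:
            for name, value in features.items():
                old = self.weights.get(name, 0.0)
                self.weights[name] = old + self.learning_rate * label * value


model = UnionSpacePerceptron()
model.update({"a": 1.0}, -1)            # only feature "a" exists so far
model.update({"a": 1.0, "b": 2.0}, 1)   # feature "b" appears mid-stream
prediction = model.predict({"b": 1.0})  # feature "a" has vanished
```

Storing weights in a dictionary keyed by feature name lets instances carry any subset of features without a fixed vector layout.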
Full Text Available
Generating precise error specifications for C: a zero shot learning approach

https://doi.org/10.1145/3360586

Wu, Baijun; Campora III, John Peter; He, Yi; Schlecht, Alexander; Chen, Sheng (October 2019, Proceedings of the ACM on Programming Languages)

In C programs, error specifications, which specify the value range that each function returns to indicate failures, are widely used to check and propagate errors for the sake of reliability and security. Various kinds of C analyzers employ error specifications for different purposes, e.g., to detect error handling bugs, yet a general approach for generating precise specifications is still missing. This limits the applicability of those tools. In this paper, we solve this problem by developing a machine learning-based approach named MLPEx. It generates error specifications by analyzing only the source code, and is thus general. We propose a novel machine learning paradigm based on transfer learning, enabling MLPEx to require only one-time minimal data labeling from us (as the tool developers) and zero manual labeling effort from users. To improve the accuracy of generated error specifications, MLPEx extracts and exploits project-specific information. We evaluate MLPEx on 10 projects, including 6 libraries and 4 applications. An investigation of 3,443 functions and 17,750 paths reveals that MLPEx generates error specifications with a precision of 91% and a recall of 94%, significantly higher than those of state-of-the-art approaches. To further demonstrate the usefulness of the generated error specifications, we use them to detect 57 bugs in 5 tested projects.
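To make the notion of an error specification concrete, the sketch below represents one as an inclusive integer range of return values that signal failure. The representation, the `ErrorSpec` class, and the function names `open_resource` and `lookup_entry` are all hypothetical and are not taken from MLPEx or from any real library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorSpec:
    """Inclusive range of return values that indicate failure."""
    low: Optional[int] = None   # None means unbounded below
    high: Optional[int] = None  # None means unbounded above

    def indicates_failure(self, value: int) -> bool:
        if self.low is not None and value < self.low:
            return False
        if self.high is not None and value > self.high:
            return False
        return True


# Hypothetical specifications for two imaginary C functions.
SPECS = {
    "open_resource": ErrorSpec(high=-1),       # any negative value = failure
    "lookup_entry": ErrorSpec(low=0, high=0),  # 0 (a NULL pointer) = failure
}


def is_error_return(function_name: str, value: int) -> bool:
    """Check a return value against the function's specification."""
    return SPECS[function_name].indicates_failure(value)
```

An analyzer holding such a table could, for example, flag call sites where a value inside the failure range is used without being checked.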
